In this study, I will try to get insights about the esoph data to see if (o)esophageal cancer is related to alcohol consumption, age or tobacco consumption, and will try to get meaning from Young People Survey’s Hobbies & Interests data by using some statistical methods and visualizations. For the data source, you can find the link at the end of the page.

1 ESOPH ANALYSIS

Esoph is R-Data from a case-control study of (o)esophageal cancer in Ille-et-Vilaine, France. Let’s investigate it first!

esoph_data<- esoph
summary(esoph_data)
##    agegp          alcgp         tobgp        ncases         ncontrols    
##  25-34:15   0-39g/day:23   0-9g/day:24   Min.   : 0.000   Min.   : 1.00  
##  35-44:15   40-79    :23   10-19   :24   1st Qu.: 0.000   1st Qu.: 3.00  
##  45-54:16   80-119   :21   20-29   :20   Median : 1.000   Median : 6.00  
##  55-64:16   120+     :21   30+     :20   Mean   : 2.273   Mean   :11.08  
##  65-74:15                                3rd Qu.: 4.000   3rd Qu.:14.00  
##  75+  :11                                Max.   :17.000   Max.   :60.00

Our data consists of three categorical variables namely agegp, alcgp two numerical columns, ncases and ncontrols. Number of cases(ncases) ranges from 0 to 17 while number of controls(ncontrols) ranges from 1 to 60.

str(esoph_data)
## 'data.frame':    88 obs. of  5 variables:
##  $ agegp    : Ord.factor w/ 6 levels "25-34"<"35-44"<..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ alcgp    : Ord.factor w/ 4 levels "0-39g/day"<"40-79"<..: 1 1 1 1 2 2 2 2 3 3 ...
##  $ tobgp    : Ord.factor w/ 4 levels "0-9g/day"<"10-19"<..: 1 2 3 4 1 2 3 4 1 2 ...
##  $ ncases   : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ ncontrols: num  40 10 6 5 27 7 4 7 2 1 ...

Here, we can see variable types better. esoph_data has 88 observations of 5 variables.

1.1 Understanding Total Number of Cases

Below, you can see that the distribution of number of cancer cases are left skewed and 0 cases constitutes 29 of the cancer cases.

a<-ggplot(esoph_data,aes(esoph_data$ncases))+geom_bar(fill="blue",alpha=0.5)+labs(x= 'Number of Cases', y= 'Count')
plotly::ggplotly(a)

## Understanding Age Group and Percentage of Cancer Cases Relation

Below, you can see that percentage of cancer cases tends to be higher when getting older. However, the highest cancer percentage is observed between 65-74 ages and lowest is observed between the 25-34 age group.

b<-esoph_data %>% group_by(agegp)%>%
  summarise(total_cases_by_age=sum(ncases),total_control_By_age=sum(ncontrols),perc=total_cases_by_age/total_control_By_age) %>%
ggplot(.,aes(x=agegp, y=perc, fill=agegp)) +
  labs(x="Age Group",y="Percentage of Cases",title="Relation between Percentage of Cancer Cases and Age Group")+ 
  geom_bar(stat="identity", position = "dodge")+
  theme(axis.text.x = element_text(size=7,vjust=0.4))+
  guides(fill=guide_legend(title="Age Groups"))+
  theme_minimal()
plotly::ggplotly(b)

1.2 Understanding Age-Group and Number of Cases Relation

Below, you can see the distribution of the number of cases by all six age groups. The highest median is found to be 4 in 55-64 age group and maximum number of cases is observed as 17 in 65-74 age group. You can analyze the distribution details by scrolling through the each box blot.

c<-ggplot(esoph_data, aes(x=agegp, y=ncases, fill=agegp)) +
  labs(x="Age Group",y="Number of Cases",title="Relation between Number of Cases and Age Group")+ 
  geom_boxplot()+
  theme(axis.text.x = element_text(size=7,vjust=0.4))+theme_minimal()
plotly::ggplotly(c)

1.3 Understanding Tobacco Consumption and Percentage of Cases Relation

Below, the tobacco consumption of each age group can be seen in six graphs. In the age groups of 45-54 and 55-64, the percentage of cancer cases are significantly higher when tobacco consumption is more than 30 g/day. In the age groups of 25-34 and 35-44, the number and percentage of cancer cases are observed significantly less than other groups.

d<-esoph_data %>%  group_by(agegp,tobgp) %>%
     summarise(total_cases = sum(ncases), total_controls = sum(ncontrols),percentage=total_cases/total_controls)

d1<-ggplot(d,aes(x=tobgp,y=percentage,fill=tobgp))+
    geom_bar(stat="identity")+
    facet_wrap(~agegp,scales="fixed")+
    guides(fill=guide_legend(title="Tobacco cons"))+
    labs(title="Tobacco Consumption vs Percentage of Cancer Cases by Age Group",x="Tobacco Consumption", y= 'Percentage Of Cancer Cases')
    
plotly::ggplotly(d1)

1.4 Understanding Alcohol Consumption and Percentage of Cases Relation

In the age groups of 25-34 and 35-44, contrary to tobacco consumption, when the alcohol consumption is higher than 120g/day, cancer can be observed. Again, in each age group, percentage of cancer cases tends to be proportional with the alcohol consumption per day.

e<-esoph %>%  group_by(agegp,alcgp) %>%
     summarise(total_cases = sum(ncases), total_controls = sum(ncontrols),percentage=total_cases*100/total_controls)

e1<-ggplot(e,aes(x=alcgp,y=percentage,fill=alcgp))+
    geom_bar(stat="identity")+
    facet_wrap(~agegp,scales="fixed")+
    guides(fill=guide_legend(title="Alcohol cons"))+
    labs(title="Alcohol Consumption vs Percentage of Cancer Cases by Age Group",x="Alcohol Consumption", y= 'Percentage Of Cancer Cases')
    
plotly::ggplotly(e1)

1.5 Understanding Age Group and Number of Controls Relation

As can be seen from the box plots, number of controls tends to be lower in 25-34 and 75+ age group and maximum control number is observed in 35-44 age group. For a detailed information, you can scroll through the graphs.

f<-ggplot(esoph_data, aes(x=agegp, y=ncontrols, fill=agegp)) +
  labs(x="Age Group",y="Number of Controls",title="Relation between Number of Controls and Age Group")+ 
  geom_boxplot()+
  theme(axis.text.x = element_text(size=7,vjust=0.4))+theme_minimal()
plotly::ggplotly(f)

1.6 Understanding Number of Cases over Number of Controls

Below, you can see number of cases over number of controls points are distributed along the lowest left corner of the plot. The outliers in both number of controls and number of cases can be seen better now.

g<-ggplot(esoph_data,aes(x=ncases,y=ncontrols,group=1,color=agegp))+
  ggtitle(label="Number of Cases over Number of Controls")+
  theme(plot.title = element_text(hjust=1))+
  geom_point()+
  theme_minimal() +
  labs(x = "Number of Cases",y = "Number of Controls", color="Age group") +
  theme(axis.text.x = element_text(size=7,vjust=0.4))
plotly::ggplotly(g)

1.7 Understanding the Effect of Alcohol and Tobacco

## effects of alcohol, tobacco and interaction, age-adjusted
model1 <- glm(cbind(ncases, ncontrols) ~ agegp + tobgp * alcgp,
              data = esoph, family = binomial())
anova(model1)
## Analysis of Deviance Table
## 
## Model: binomial, link: logit
## 
## Response: cbind(ncases, ncontrols)
## 
## Terms added sequentially (first to last)
## 
## 
##             Df Deviance Resid. Df Resid. Dev
## NULL                           87    227.241
## agegp        5   88.128        82    139.112
## tobgp        3   19.085        79    120.028
## alcgp        3   66.054        76     53.973
## tobgp:alcgp  9    6.489        67     47.484
## Try a linear effect of alcohol and tobacco
model2 <- glm(cbind(ncases, ncontrols) ~ agegp + unclass(tobgp)
                                         + unclass(alcgp),
              data = esoph, family = binomial())
summary(model2)
## 
## Call:
## glm(formula = cbind(ncases, ncontrols) ~ agegp + unclass(tobgp) + 
##     unclass(alcgp), family = binomial(), data = esoph)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.7628  -0.6426  -0.2709   0.3043   2.0421  
## 
## Coefficients:
##                Estimate Std. Error z value Pr(>|z|)    
## (Intercept)    -4.01097    0.31224 -12.846  < 2e-16 ***
## agegp.L         2.96113    0.65092   4.549 5.39e-06 ***
## agegp.Q        -1.33735    0.58918  -2.270  0.02322 *  
## agegp.C         0.15292    0.44792   0.341  0.73281    
## agegp^4         0.06668    0.30776   0.217  0.82848    
## agegp^5        -0.20288    0.19523  -1.039  0.29872    
## unclass(tobgp)  0.26162    0.08198   3.191  0.00142 ** 
## unclass(alcgp)  0.65308    0.08452   7.727 1.10e-14 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 227.241  on 87  degrees of freedom
## Residual deviance:  59.277  on 80  degrees of freedom
## AIC: 222.76
## 
## Number of Fisher Scoring iterations: 6

Our median residuals is close to zero which means our model is not biased in one direction (i.e. the out come is neither over- nor underestimated). The contributions of age group and alcohol group to the deviance is higher than tobgp and both tobgp:alcgp, as can be seen from the first model.

From the second generalized linear model, we can observe that agegp.L, tobgp and alcgp has a positive effect, whereas agegp.Q and agegp^5 has negative effect. tobgp, alcgp, agegp.L and agegp.Q are statistically significant. Null deviance is relatively higher than residual deviance, which is a good sign! This means using more than a single parameter explains the model better.

2 YOUNG PEOPLE SURVEY ANALYSIS

First let’s read the data.

young_ppl_survey<-read.csv("responses.csv")

Now, we get the related columns that we will work on and clean the data.

filtered_survey<-young_ppl_survey[,32:63]
filtered_survey=filtered_survey[complete.cases(filtered_survey),]

Now, our data is ready for some interesting insights! First, let’s see if we can reduce the dimensionality of this large data set while keeping as much information as possible. For that, we will use Principal Component Analysis(PCA).

2.1 Principal Component Analysis

##                 History    Psychology   Politics Mathematics    Physics
## History     1.000000000  0.2963735855 0.40623429 0.008786356 0.07088193
## Psychology  0.296373585  1.0000000000 0.19108731 0.043974943 0.07142061
## Politics    0.406234287  0.1910873113 1.00000000 0.103492328 0.13388052
## Mathematics 0.008786356  0.0439749434 0.10349233 1.000000000 0.60785444
## Physics     0.070881929  0.0714206142 0.13388052 0.607854436 1.00000000
## Internet    0.006612009 -0.0003077005 0.04428471 0.156686522 0.10666200
##                  Internet
## History      0.0066120093
## Psychology  -0.0003077005
## Politics     0.0442847125
## Mathematics  0.1566865225
## Physics      0.1066619993
## Internet     1.0000000000

Here, we see that mathematics and physics are positively correlated, unsurprisingly!

survey_pca <- princomp(as.matrix(filtered_survey),cor=T)
summary(survey_pca,loadings=TRUE)
## Importance of components:
##                           Comp.1    Comp.2     Comp.3     Comp.4     Comp.5
## Standard deviation     2.0433958 1.8069970 1.60438109 1.46391156 1.26463164
## Proportion of Variance 0.1304833 0.1020387 0.08043871 0.06696991 0.04997791
## Cumulative Proportion  0.1304833 0.2325220 0.31296072 0.37993063 0.42990854
##                            Comp.6     Comp.7     Comp.8     Comp.9    Comp.10
## Standard deviation     1.18461175 1.06955650 1.05602251 1.04363010 1.00633062
## Proportion of Variance 0.04385328 0.03574847 0.03484949 0.03403637 0.03164692
## Cumulative Proportion  0.47376182 0.50951029 0.54435978 0.57839615 0.61004306
##                           Comp.11    Comp.12    Comp.13    Comp.14    Comp.15
## Standard deviation     0.95951254 0.93789360 0.93168719 0.89503805 0.87262765
## Proportion of Variance 0.02877076 0.02748889 0.02712628 0.02503416 0.02379622
## Cumulative Proportion  0.63881382 0.66630271 0.69342899 0.71846315 0.74225937
##                           Comp.16   Comp.17    Comp.18    Comp.19    Comp.20
## Standard deviation     0.85510068 0.8329419 0.80296355 0.78632469 0.76747805
## Proportion of Variance 0.02284991 0.0216810 0.02014845 0.01932208 0.01840695
## Cumulative Proportion  0.76510928 0.7867903 0.80693874 0.82626082 0.84466777
##                           Comp.21    Comp.22    Comp.23    Comp.24    Comp.25
## Standard deviation     0.74821762 0.72513227 0.71054805 0.70905599 0.69087699
## Proportion of Variance 0.01749468 0.01643178 0.01577745 0.01571126 0.01491597
## Cumulative Proportion  0.86216245 0.87859422 0.89437168 0.91008294 0.92499891
##                           Comp.26    Comp.27    Comp.28   Comp.29     Comp.30
## Standard deviation     0.65750430 0.64296368 0.60888823 0.5954011 0.551814415
## Proportion of Variance 0.01350975 0.01291882 0.01158578 0.0110782 0.009515598
## Cumulative Proportion  0.93850866 0.95142748 0.96301326 0.9740915 0.983607058
##                            Comp.31     Comp.32
## Standard deviation     0.539017589 0.483770803
## Proportion of Variance 0.009079374 0.007313568
## Cumulative Proportion  0.992686432 1.000000000
## 
## Loadings:
##                        Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
## History                 0.196         0.184  0.243  0.130  0.243  0.142  0.194
## Psychology              0.244                0.119  0.112        -0.102       
## Politics                0.131  0.202  0.255  0.204  0.270  0.153        -0.108
## Mathematics                    0.295 -0.126  0.133        -0.309 -0.220 -0.273
## Physics                        0.344 -0.241  0.136        -0.155 -0.151 -0.151
## Internet                       0.237  0.127               -0.400         0.370
## PC                             0.377               -0.134 -0.291  0.137  0.144
## Economy.Management             0.174  0.306         0.214 -0.182 -0.166 -0.288
## Biology                 0.269        -0.377 -0.151  0.221                     
## Chemistry               0.199        -0.421         0.230                     
## Reading                 0.282 -0.188         0.176                       0.173
## Geography               0.160  0.145  0.151                0.218  0.226  0.243
## Foreign.languages       0.208         0.217               -0.117 -0.165  0.227
## Medicine                0.269        -0.317 -0.118  0.266                     
## Law                     0.143  0.147  0.277         0.363  0.116        -0.142
## Cars                           0.350        -0.168                            
## Art.exhibitions         0.313                      -0.201                     
## Religion                0.230                0.166                0.128       
## Countryside..outdoors   0.194               -0.102 -0.334         0.167  0.119
## Dancing                 0.241               -0.230               -0.107 -0.326
## Musical.instruments     0.210                      -0.361               -0.158
## Writing                 0.231                0.161 -0.194 -0.114  0.214 -0.171
## Passive.sport                  0.162        -0.222 -0.135  0.116              
## Active.sport                   0.200        -0.241 -0.138  0.340        -0.286
## Gardening               0.192               -0.178                0.445 -0.200
## Celebrities                           0.183 -0.351  0.181 -0.289  0.234       
## Shopping                      -0.136  0.178 -0.399  0.153 -0.241              
## Science.and.technology         0.348                                     0.195
## Theatre                 0.302 -0.128               -0.144        -0.256  0.115
## Fun.with.friends                      0.135 -0.260 -0.154        -0.498  0.208
## Adrenaline.sports              0.254        -0.221 -0.200  0.313 -0.179       
## Pets                                        -0.256                0.138  0.132
##                        Comp.9 Comp.10 Comp.11 Comp.12 Comp.13 Comp.14 Comp.15
## History                 0.162                                  0.252   0.233 
## Psychology              0.214 -0.181   0.232   0.272   0.536   0.153  -0.266 
## Politics                               0.139          -0.187           0.146 
## Mathematics                    0.360          -0.129   0.147                 
## Physics                 0.127  0.160                           0.130         
## Internet                      -0.107   0.147           0.264   0.174   0.157 
## PC                            -0.183                                   0.100 
## Economy.Management     -0.163  0.189                          -0.190  -0.148 
## Biology                                                                      
## Chemistry                                     -0.142                   0.101 
## Reading                        0.116  -0.130  -0.150                   0.177 
## Geography              -0.335  0.105  -0.349  -0.228   0.113   0.155         
## Foreign.languages      -0.364 -0.122  -0.187           0.114  -0.408         
## Medicine               -0.135 -0.178                          -0.101         
## Law                                    0.135          -0.179  -0.158         
## Cars                    0.101                  0.125  -0.331  -0.100  -0.174 
## Art.exhibitions         0.166                         -0.156   0.114  -0.343 
## Religion               -0.220          0.370   0.456  -0.146   0.147   0.103 
## Countryside..outdoors  -0.240  0.371           0.412                         
## Dancing                -0.204         -0.108   0.114   0.102   0.145   0.200 
## Musical.instruments    -0.106 -0.226   0.270  -0.141  -0.110  -0.428   0.197 
## Writing                 0.212 -0.319          -0.309   0.149                 
## Passive.sport          -0.150  0.259   0.560  -0.466   0.125          -0.296 
## Active.sport                  -0.204  -0.214           0.153   0.175         
## Gardening               0.108  0.220                  -0.109           0.146 
## Celebrities                                                    0.185         
## Shopping                                              -0.108   0.105         
## Science.and.technology  0.194 -0.169  -0.186   0.121  -0.194          -0.182 
## Theatre                 0.190  0.145                  -0.258          -0.252 
## Fun.with.friends                       0.173          -0.147   0.130   0.471 
## Adrenaline.sports       0.103 -0.120                   0.141                 
## Pets                    0.467  0.336           0.108   0.306  -0.465   0.209 
##                        Comp.16 Comp.17 Comp.18 Comp.19 Comp.20 Comp.21 Comp.22
## History                         0.141           0.130   0.369                 
## Psychology             -0.125           0.116           0.242   0.163         
## Politics                                0.149   0.140                  -0.154 
## Mathematics             0.183  -0.171                   0.169                 
## Physics                 0.150  -0.125   0.126                           0.162 
## Internet                        0.195          -0.216  -0.225  -0.113         
## PC                              0.226          -0.103                  -0.249 
## Economy.Management     -0.333                          -0.159          -0.224 
## Biology                                                                -0.141 
## Chemistry                                                      -0.102         
## Reading                                -0.411                  -0.271   0.131 
## Geography                      -0.236   0.302          -0.223   0.285         
## Foreign.languages                      -0.329   0.154   0.303   0.202   0.111 
## Medicine                                       -0.165                         
## Law                             0.187          -0.301          -0.262   0.183 
## Cars                    0.164                  -0.145           0.380   0.464 
## Art.exhibitions         0.122                  -0.313           0.286  -0.199 
## Religion                0.125  -0.435  -0.322          -0.259                 
## Countryside..outdoors                   0.192           0.189  -0.397   0.125 
## Dancing                         0.345   0.202   0.211  -0.311   0.108   0.310 
## Musical.instruments     0.200           0.303           0.213          -0.230 
## Writing                -0.169  -0.250                  -0.224  -0.178   0.358 
## Passive.sport                          -0.128   0.274                         
## Active.sport            0.220   0.172  -0.432   0.108                  -0.301 
## Gardening              -0.564          -0.193           0.189   0.236         
## Celebrities             0.248  -0.252   0.131           0.254  -0.187  -0.201 
## Shopping                0.101  -0.178           0.282   0.122           0.118 
## Science.and.technology -0.189                   0.554  -0.102  -0.148         
## Theatre                         0.185                  -0.147          -0.153 
## Fun.with.friends       -0.304  -0.160                           0.126         
## Adrenaline.sports      -0.111  -0.365          -0.278          -0.278         
## Pets                    0.240                          -0.263   0.152         
##                        Comp.23 Comp.24 Comp.25 Comp.26 Comp.27 Comp.28 Comp.29
## History                 0.129   0.487                   0.247   0.164         
## Psychology                     -0.228  -0.162  -0.152  -0.144  -0.131         
## Politics                0.119  -0.337   0.355   0.335  -0.385           0.120 
## Mathematics                                                            -0.195 
## Physics                -0.172  -0.129                                   0.263 
## Internet                                0.219          -0.206   0.359  -0.223 
## PC                                                      0.189  -0.442   0.499 
## Economy.Management      0.180   0.443  -0.235          -0.101                 
## Biology                         0.102                                         
## Chemistry                       0.221   0.135   0.165                         
## Reading                 0.303          -0.144  -0.245  -0.303  -0.348  -0.174 
## Geography                              -0.175  -0.272                         
## Foreign.languages              -0.174   0.149   0.178   0.190   0.137   0.118 
## Medicine                               -0.110                   0.155         
## Law                    -0.438  -0.128          -0.253   0.319                 
## Cars                    0.262          -0.220          -0.261                 
## Art.exhibitions        -0.126           0.242   0.332   0.122  -0.262  -0.311 
## Religion                                                0.142                 
## Countryside..outdoors  -0.135          -0.142   0.280  -0.132                 
## Dancing                 0.274  -0.102   0.142           0.260                 
## Musical.instruments             0.103          -0.257  -0.171          -0.179 
## Writing                -0.110   0.165  -0.197   0.289                   0.140 
## Passive.sport                                           0.112  -0.129         
## Active.sport           -0.258          -0.139   0.125  -0.204   0.126         
## Gardening                      -0.112   0.157  -0.193           0.135         
## Celebrities             0.318  -0.237  -0.228   0.169   0.194                 
## Shopping               -0.342   0.322   0.255  -0.186  -0.295  -0.215   0.163 
## Science.and.technology         -0.107                   0.175          -0.435 
## Theatre                                        -0.202           0.486   0.372 
## Fun.with.friends       -0.126          -0.305   0.136                         
## Adrenaline.sports       0.290           0.419  -0.245                         
## Pets                                                                          
##                        Comp.30 Comp.31 Comp.32
## History                 0.179                 
## Psychology             -0.195                 
## Politics                        0.113         
## Mathematics                     0.575         
## Physics                 0.123  -0.650         
## Internet                                      
## PC                              0.162         
## Economy.Management             -0.239         
## Biology                                -0.792 
## Chemistry              -0.623           0.390 
## Reading                        -0.155         
## Geography                                     
## Foreign.languages                             
## Medicine                0.632           0.397 
## Law                                           
## Cars                                          
## Art.exhibitions                -0.102         
## Religion               -0.111                 
## Countryside..outdoors                         
## Dancing                                       
## Musical.instruments                           
## Writing                         0.150         
## Passive.sport                                 
## Active.sport                                  
## Gardening                                     
## Celebrities                                   
## Shopping                0.134                 
## Science.and.technology                        
## Theatre                -0.175   0.161         
## Fun.with.friends                              
## Adrenaline.sports                             
## Pets

From the analysis, it can be observed that first 8 component explains more than %50 of the total variance. Reading affects first component positively while second component is affected by reading negatively.

More than %80 of the variance is explained by only 20 principal components. Here, you can see the cumulative variance by principal components.

2.2 Multidimensional Scaling

In this part, in order to provide a good visual representation, we will use multidimensional scaling technique. The key here is the usage of correlation matrix as similarity measure.

#Set the seed for reproducibility
set.seed(120)
# Here 1 makes the distance values positive!
yr_dist<- 1 - cor(filtered_survey)
yr_mds <- cmdscale(yr_dist,k=2)
#Provide column names
colnames(yr_mds) <- c("x","y")
print(yr_mds)
##                                  x            y
## History                 0.06372563  0.012956654
## Psychology              0.23826552 -0.024422880
## Politics               -0.14749496  0.107881476
## Mathematics            -0.32345626 -0.297702691
## Physics                -0.32464296 -0.499240678
## Internet               -0.43673303  0.187118496
## PC                     -0.58516404 -0.107666761
## Economy.Management     -0.30422083  0.383264449
## Biology                 0.25483318 -0.395803325
## Chemistry               0.12745384 -0.486209735
## Reading                 0.54412403 -0.002498553
## Geography              -0.04741594  0.042633809
## Foreign.languages       0.20488850  0.223464512
## Medicine                0.21225026 -0.352629731
## Law                    -0.07751035  0.252871501
## Cars                   -0.58387376  0.046912232
## Art.exhibitions         0.37143545  0.009747048
## Religion                0.19597972 -0.174122589
## Countryside..outdoors   0.14091867 -0.071530114
## Dancing                 0.27743365  0.136486519
## Musical.instruments     0.16521053 -0.134880364
## Writing                 0.30297822 -0.032446839
## Passive.sport          -0.30096850  0.108795592
## Active.sport           -0.24240576  0.047405197
## Gardening               0.19439442 -0.077481786
## Celebrities             0.06494596  0.467218540
## Shopping                0.20807212  0.463704403
## Science.and.technology -0.36445481 -0.278443856
## Theatre                 0.44547786  0.054248884
## Fun.with.friends       -0.05316373  0.253721685
## Adrenaline.sports      -0.30291870  0.041430796
## Pets                    0.08203608  0.095218109
ggplot(data.frame(yr_mds),aes(x=x,y=y)) + geom_text(label=rownames(yr_mds),angle=45,size=3)

Now, we have a better insight about the hobbies and interests of young people! As we figured out before, physics has close relationship with mathematics,although not as close as Science.and.technology and mathematics! Chemistry, medicine and biology also are three very close interests. Unsurprisingly again, writing, art.exhibitions and theater have close relationship with each other.